value with variety one; hence, using (6.5), $I = 0$ after the measurement; that is, he
considers the result to be known with certainty once it has been delivered. Hence,
it is considered to have zero information, and it is in this sense that an information
processor is also an information annihilator. Wiener considers the more general
case in which the result of the measurement could be less than certain (e.g., still a
distribution, but narrower than the one measured).
The gain of information $I$ is equivalent to the removal of uncertainty; hence,
information could be defined as “that which removes uncertainty.” It corresponds to
the reduction of variety perceived by an observer and varies inversely with the
probability of a particular value being read, or a particular symbol (or set of symbols)
being selected, or, more generally, with the probability of a message being received
and remembered.
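In Shannon's formulation this inverse dependence is logarithmic: the information gained from a message of probability $p$ is $\log_2(1/p)$ bits. A minimal sketch (the function name is illustrative, not from the text):

```python
import math

def surprisal(p: float) -> float:
    """Information, in bits, gained on receiving a message of probability p."""
    return -math.log2(p)

print(surprisal(0.5))   # 1.0 bit: one fair binary choice removed
print(surprisal(0.25))  # 2.0 bits: one of four equally likely messages
```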
Example. An $N \times N$ grid of pixels, each of which can be either black or white, can
convey at most $-\sum_{i=1}^{N^2}\sum_{j=1}^{2}\tfrac{1}{2}\log_2\tfrac{1}{2} = N^2$ bits of information. This maximum is achieved when
the probability of being either black or white is equal.
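The same count can be checked numerically. The sketch below (names are illustrative) sums the per-pixel entropy over the grid, assuming independent, equiprobable pixels:

```python
import math

def max_grid_information(n: int) -> float:
    """Maximum information, in bits, of an n x n grid of black/white pixels,
    assuming each pixel is independently black or white with probability 1/2."""
    p = 0.5
    per_pixel = -sum(p * math.log2(p) for _ in ("black", "white"))  # = 1 bit
    return n * n * per_pixel

print(max_grid_information(8))  # 64.0 bits for an 8 x 8 grid
```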
$I$ defined by Eqs. (6.4) and (6.5) has the properties that one may reasonably
postulate should be possessed by a measure of information, namely

1. $I(E_{NM}) = I(E_N) + I(E_M)$, for $N, M = 1, 2, \ldots$;
2. $I(E_N) \le I(E_{N+1})$;
3. $I(E_2) = 1$.
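These postulates are satisfied by $I(E_N) = \log_2 N$ for an experiment $E_N$ with $N$ equally likely outcomes, the form used in the surrounding examples; a quick numerical check, with illustrative values:

```python
import math

def info(n: int) -> float:
    """Information of an experiment E_n with n equally likely outcomes."""
    return math.log2(n)

# Postulate 1: additivity over independent experiments, I(E_NM) = I(E_N) + I(E_M).
assert math.isclose(info(4 * 8), info(4) + info(8))
# Postulate 2: monotonicity, I(E_N) <= I(E_N+1).
assert all(info(n) <= info(n + 1) for n in range(1, 100))
# Postulate 3: normalization -- one binary choice carries one bit.
assert info(2) == 1.0
```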
Example. How much information is contained in a sequence of DNA? If each of
the four bases is chosen with equal probability (i.e., $p = \tfrac{1}{4}$), the information in
a decamer is $10 \log_2 4 = 20$ bits. It is the average number of yes/no questions that
would be needed to ascertain the sequence. If the sequence were completely unknown
before questioning, this is the gain in information. Any constraints imposed on the
assembly of the sequence (for example, a rule that "AA" is never followed by "T")
will lower the information content of the sequence (i.e., the gain in information
upon receiving the sequence, assuming that those constraints are known to us).
Some proteins are heavily constrained; the antifreeze glycoprotein (alanine-alanine-
threonine)$_n$ could be simply specified by the instruction "repeat AAT $n$ times", much
more compactly than writing out the amino acid sequence in full, and the quantity
of information gained upon being informed of the sequence, enunciated explicitly,
is correspondingly small.
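Both points can be made concrete in code: an unconstrained decamer needs $10\log_2 4 = 20$ bits, while a constraint such as the "AA never followed by T" rule shrinks the set of admissible sequences and hence the bits needed to single one out. A sketch with illustrative function names (the brute-force enumeration is for demonstration only):

```python
import math
from itertools import product

BASES = "ACGT"

def sequence_information(length: int, alphabet: int = 4) -> float:
    """Bits needed to specify one sequence drawn uniformly from alphabet**length possibilities."""
    return length * math.log2(alphabet)

def constrained_information(length: int) -> float:
    """Bits needed when 'AA' may never be followed by 'T' (substring 'AAT' forbidden)."""
    allowed = sum(1 for s in product(BASES, repeat=length) if "AAT" not in "".join(s))
    return math.log2(allowed)

print(sequence_information(10))     # 20.0 bits for an unconstrained decamer
print(constrained_information(10))  # slightly under 20 bits
```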
Thermodynamic Entropy. One often encounters the word "entropy" used synony-
mously with information (or its removal). Entropy ($S$) in a physical system represents
the ability of a system to absorb energy without increasing its temperature. Under
isothermal conditions (i.e., at a constant temperature $T$),
$$\mathrm{d}Q = T\,\mathrm{d}S \, , \qquad (6.8)$$
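As a numerical illustration of (6.8), with arbitrarily chosen values (not from the text): the entropy gained by a system reversibly absorbing heat $Q$ at constant temperature $T$ is $Q/T$.

```python
# Reversible isothermal heat absorption: Delta S = Q / T (from Eq. 6.8).
Q = 1000.0   # heat absorbed, in joules (illustrative value)
T = 298.15   # constant temperature, in kelvin (illustrative value)

delta_S = Q / T
print(f"Delta S = {delta_S:.2f} J/K")  # ~3.35 J/K
```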